Figure E1: Positive control dilution curves and four-parameter log-logistic curve fits for Vibrio cholerae O1 antigen multiplex bead assay markers

Each point represents the median fluorescence intensity measurement (y-value) for a dilution (x-value) of pooled convalescent sera from confirmed positive Vibrio cholerae O1 patients. Each plate’s dilution series was fit using a four-parameter log-logistic regression (black lines). The shape of each point identifies its plate.
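As an illustration of the curve family named above, the following minimal sketch writes out one common four-parameter log-logistic parameterization and its inverse. The parameter names (b, c, d, e) and all numeric values are assumptions chosen for illustration; they are not estimates from this study, and the fitting step itself is not shown.

```python
import math

def ll4(x, b, c, d, e):
    """Four-parameter log-logistic curve.

    c = lower asymptote, d = upper asymptote, b = slope,
    e = dilution at the midpoint between the asymptotes.
    """
    return c + (d - c) / (1.0 + math.exp(b * (math.log(x) - math.log(e))))

def ll4_inverse(y, b, c, d, e):
    """Dilution x at which the curve reaches response y."""
    if not (min(c, d) < y < max(c, d)):
        raise ValueError("y must lie strictly between the two asymptotes")
    return e * ((d - y) / (y - c)) ** (1.0 / b)

# Illustrative parameters only (hypothetical, not from the study).
params = dict(b=-1.2, c=50.0, d=30000.0, e=0.001)
mfi = ll4(0.0005, **params)              # predicted MFI at a 1:2000 dilution
recovered = ll4_inverse(mfi, **params)   # maps the MFI back to the dilution
```

Inverting a fitted standard curve in this way is one route from an observed MFI to a value on the dilution scale.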

Figure E2: Positive control dilution curves and four-parameter log-logistic curve fits for additional antigen multiplex bead assay markers

Each point represents the median fluorescence intensity measurement (y-value) for a dilution (x-value) of pooled convalescent sera from confirmed positive Vibrio cholerae O1 patients. Each plate’s dilution series was fit using a four-parameter log-logistic regression (black lines). The shape of each point identifies its plate.

Figure E3: Relationship between relative antibody units and median fluorescence intensity for additional antigen multiplex bead assay markers

Relative antibody unit (RAU) measurements for each sample are plotted against the median fluorescence intensity (MFI) calculated by averaging triplicate measurements. RAU estimates were truncated at 10^2 and 10^5 (red points).
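The truncation described above can be sketched as a simple clamp. The function name and the returned flag (which would correspond to the red points) are hypothetical; only the bounds 10^2 and 10^5 come from the caption.

```python
def truncate_rau(rau, lower=1e2, upper=1e5):
    """Clamp an RAU estimate to [lower, upper].

    Returns the clamped value and a flag that is True when the
    original estimate fell outside the bounds.
    """
    clamped = min(max(rau, lower), upper)
    return clamped, clamped != rau
```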

Figure E4: Relationship between relative antibody units and median fluorescence intensity for non-Vibrio cholerae O1 antigen multiplex bead assay markers

Relative antibody unit (RAU) measurements for each sample are plotted against the median fluorescence intensity (MFI) calculated by averaging triplicate measurements. RAU estimates were truncated at 10^2 and 10^5 (red points). LT-B = heat labile toxin B subunit. LT-H = heat labile toxin holotoxin.

Figure E5: Multiplex bead assay Net MFI measurements of IgG, IgM, and IgA against V. cholerae O1 antigens among culture-confirmed cholera patients

The y-axis indicates the log (base 10) of the Net MFI and the x-axis is the number of days post-infection (square-root transformed). Each colored line shows one individual's trajectory over time. The black solid line is a LOESS smooth.

Figure E6: Multiplex bead assay Net MFI measurements of IgG, IgM, and IgA against additional antigens among culture-confirmed cholera patients

The y-axis indicates the log (base 10) of the Net MFI and the x-axis is the number of days post-infection (square-root transformed). Each colored line shows one individual's trajectory over time. The black solid line is a LOESS smooth.

Table E1: Biphasic and exponential decay model comparison

The difference in expected log-predictive density (ELPD) and its standard error were calculated using the loo R package. Asterisks denote that the ELPD of the biphasic model (which includes more parameters) was more than 1.96 standard errors larger than the ELPD of the exponential model. See Supplementary Appendix 2 for equations defining each model.
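The asterisk rule above amounts to a one-line check. In this sketch the function name and the example numbers are hypothetical; the ELPD difference and its standard error would come from the model comparison output.

```python
def biphasic_preferred(elpd_diff, se_diff, z=1.96):
    """True when ELPD(biphasic) - ELPD(exponential) exceeds z standard errors.

    elpd_diff: ELPD of the biphasic model minus ELPD of the exponential model.
    se_diff:   standard error of that difference.
    """
    return elpd_diff > z * se_diff

# Made-up values for illustration only.
flag_a = biphasic_preferred(10.0, 4.0)   # 10.0 > 7.84
flag_b = biphasic_preferred(5.0, 4.0)    # 5.0 <= 7.84
```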

Figure E7: Individual-level trajectories of Ogawa OSP IgG

Each facet shows the log10(RAU) measurements for one individual over time (points). The solid line indicates the median value of the exponential decay model. The shaded area is the 95% credible interval.

Figure E8: Individual-level trajectories of Ogawa OSP IgA

Each facet shows the log10(RAU) measurements for one individual over time (points). The solid line indicates the median value of the exponential decay model. The shaded area is the 95% credible interval.

Figure E9: Individual-level trajectories of Ogawa OSP IgM

Each facet shows the log10(RAU) measurements for one individual over time (points). The solid line indicates the median value of the exponential decay model. The shaded area is the 95% credible interval.

Figure E10: Individual-level trajectories of CT-B IgG

Each facet shows the log10(RAU) measurements for one individual over time (points). The solid line indicates the median value of the exponential decay model. The shaded area is the 95% credible interval.

Figure E11: Individual-level trajectories of CT-B IgA

Each facet shows the log10(RAU) measurements for one individual over time (points). The solid line indicates the median value of the exponential decay model. The shaded area is the 95% credible interval.

Figure E12: Individual-level trajectories of CT-B IgM

Each facet shows the log10(RAU) measurements for one individual over time (points). The solid line indicates the median value of the exponential decay model. The shaded area is the 95% credible interval.

Table E2: Estimated duration of half-life and average fold-change from univariate exponential decay models

The median estimated half-life (in days) and the median estimated fold-change for the average individual are shown below as well as in Figure 2. 95% Bayesian credible intervals are shown in parentheses.
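The sketch below shows how a half-life and its interval could be summarized from posterior draws of a decay rate. It assumes a simple exponential form in which the half-life equals ln(2) divided by the decay rate; the study's actual model is defined in Supplementary Appendix 2 and may differ. The draws are synthetic and the helper names are hypothetical.

```python
import math
import statistics

def half_life(decay_rate):
    """Half-life (days) for exponential decay with the given daily rate."""
    return math.log(2) / decay_rate

def summarize(draws, level=0.95):
    """Median and equal-tailed interval (nearest-rank percentiles) of draws."""
    ordered = sorted(draws)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lo = ordered[max(0, math.ceil(tail * n) - 1)]
    hi = ordered[min(n - 1, math.ceil((1.0 - tail) * n) - 1)]
    return statistics.median(ordered), lo, hi

# Synthetic decay-rate draws, for illustration only.
rates = [0.004 + 0.00001 * i for i in range(1000)]
half_lives = [half_life(r) for r in rates]
median_hl, lower_hl, upper_hl = summarize(half_lives)
```

Transforming each draw before summarizing, rather than transforming the summary of the rate, keeps the interval on the half-life scale.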

Table E3: Relative parameter values for univariate exponential decay models including covariates for different serological markers

The median relative baseline value, median relative fold-change value, and median difference in half-life (in days) are reported. 95% Bayesian credible intervals are shown in parentheses. Reference categories are male, non-O blood group, Inaba, and 10+ years old.

Figure E13: Estimated duration of half-life and average fold-change from exponential decay models fit to Net MFI measurements

Each point indicates the median estimate of the average individual fold-rise from baseline to peak (y-value) and the median estimate of the half-life (x-value) for exponential decay univariate models. Marginal 95% credible intervals are shown as lines. Model estimates for the vibriocidal assay are shown for reference and are identical across panels.

Figure E14: Comparison of cross-validated area under the ROC curve between ensemble and random forest models

Models were fit to 18 MBA markers and 3 demographic variables (age, sex, and blood type). The ensemble model was created using the R package SuperLearner. Four different types of models were combined into the ensemble: Random Forest models (ranger), Lasso and Elastic-Net Regularized Generalized Linear Models (glmnet), Bayesian Additive Regression Trees (bartMachine), and Extreme Gradient Boosting (xgboost). All models were unweighted.

Figure E15: Cross-validated area under the receiver operating characteristic curve (cvAUC) and predictor importance rankings for Net MFI markers across random forest models with varying infection windows

Estimates of mean cvAUC (10-fold) and 95% confidence interval are shown for weighted and non-weighted models between 50- and 600-day infection windows at 10-day intervals (A). The rug plot shows the day of collection of samples from cases used in training models. Samples collected fewer than 5 days after infection, more than 600 days after infection, or from household contacts are not shown. For each infection window of weighted models, the rankings of predictors by their importance are shown on the y-axis (B). Line colors are unique to each predictor.
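To make the quantities in this caption concrete, the sketch below labels samples by infection window and computes a fold-averaged AUC from held-out predictions. It is a simplified illustration: the inclusive window boundary, the function names, and the plain mean across folds are assumptions, and no confidence interval or sample weighting is computed.

```python
def label_recent(days_since_infection, window):
    """1 if the sample falls inside the infection window, else 0."""
    return 1 if days_since_infection <= window else 0

def auc(labels, scores):
    """AUC as the probability that a positive outscores a negative.

    Ties between a positive and a negative count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def cv_auc(folds):
    """Mean AUC across held-out folds; each fold is (labels, scores)."""
    values = [auc(y, s) for y, s in folds]
    return sum(values) / len(values)
```

Changing the window changes which samples count as positives, so the same set of predictions can yield a different AUC for each window.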

Table E4: Comparison of cross-validated AUC between multiple marker random forest models using 45-day, 120-day, 200-day, and 300-day infection windows

Random forest models were fit using the specified marker set and individual-level factors including age, sex, and blood group. Mean and 95% confidence intervals for cvAUC are reported.

Table E5: Comparison of cross-validated AUC between multiplex bead assay IgG multiple marker random forest models using 45-day, 120-day, 200-day, and 300-day infection windows

Random forest models were fit using a reduced panel of MBA IgG markers and individual-level factors including age, sex, and blood group. Mean and 95% confidence intervals for cvAUC are reported.

Figure E16: Comparison of cross-validated AUC across random forest models trained on traditional and Net MFI MBA serological markers for 45-day, 120-day, 200-day, and 300-day infection windows

Random forest models were fit using a specified marker set and individual-level factors including age, sex, and blood type (A). Estimated mean and 95% confidence intervals for cvAUC are reported. Models fit to reduced panels of IgG MBA markers are shown (B). The order in which antigens were added was determined by the variable importance when fitting a model with only IgG MBA markers.

Figure E17: Specificity and time-varying sensitivity estimates of random forest models trained on traditional and Net MFI MBA serological markers with leave-one-out cross-validation for 45-day, 120-day, 200-day, and 300-day infection windows using different cut-offs

Median and 95% credible intervals are shown for the estimated (A) nominal specificity (black dashed line) and (B) time-varying sensitivity. Each row represents a different method for choosing a cut-off, including the Youden Index or maximizing sensitivity for a desired value of specificity. The relationship between logit(sensitivity) and time since infection (log-transformed) was constant for the 45-day window, linear for the 120-day window, quadratic for the 200-day window, and cubic for the 300-day window. Traditional = vibriocidal Ogawa, vibriocidal Inaba, and 4 ELISA markers; All MBA = 18 MBA markers; All MBA IgG = 6 MBA markers; Reduced panel = Ogawa OSP, Inaba OSP, and CT-B IgG. All models also included age, sex, and blood type as predictors.
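The two cut-off strategies named above can be sketched as follows. This is a minimal illustration on made-up scores, assuming that predictions at or above the cut-off are called positive and that candidate cut-offs are restricted to observed scores; the function names are hypothetical.

```python
def sens_spec(labels, scores, cutoff):
    """Sensitivity and specificity when scores >= cutoff are called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= cutoff)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < cutoff)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < cutoff)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

def youden_cutoff(labels, scores):
    """Cut-off maximizing J = sensitivity + specificity - 1."""
    return max(sorted(set(scores)),
               key=lambda c: sum(sens_spec(labels, scores, c)) - 1.0)

def cutoff_for_specificity(labels, scores, target):
    """Lowest cut-off with specificity >= target.

    Specificity never decreases and sensitivity never increases as the
    cut-off rises, so the lowest qualifying cut-off maximizes sensitivity.
    Returns None if no observed score reaches the target.
    """
    for c in sorted(set(scores)):
        if sens_spec(labels, scores, c)[1] >= target:
            return c
    return None

# Made-up predictions for illustration only.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.5, 0.3, 0.2, 0.1]
j_cut = youden_cutoff(labels, scores)
strict_cut = cutoff_for_specificity(labels, scores, 1.0)
```

On these toy data the stricter specificity target moves the cut-off upward and lowers sensitivity, which is the trade-off the figure rows compare.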

Figure E18: Comparison of cross-validated AUC across random forest models trained on traditional and MBA serological markers excluding anti-CT-B markers for 45-day, 120-day, 200-day, and 300-day infection windows

(A) Random forest models were fit using the specified marker set and individual-level factors including age, sex, and blood group. Estimated mean and 95% confidence intervals for cvAUC are reported. (B-D) Models fit to reduced panels of IgG, IgA, and IgM MBA markers are shown.